Improving application load times by caching shader binaries in a persistent storage

ABSTRACT

A method for compiling a shader for execution by a graphics processor. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims the benefit of Provisional Application No. 61/585,620, filed on Jan. 11, 2012, titled “GRAPHICS PROCESSOR CLOCK SCALING, APPLICATION LOAD TIME IMPROVEMENTS, AND DYNAMICALLY ADJUSTING RESOLUTION OF RENDER BUFFER TO IMPROVE AND STABILIZE FRAME TIMES OF A GRAPHICS PROCESSOR,” by Swaminathan Narayanan, et al., which is herein incorporated by reference.

TECHNICAL FIELD

Embodiments of the present disclosure relate generally to the field of graphics processing and more specifically to the field of improved shader binary caching and execution for efficient graphics processing.

BACKGROUND

High level graphics languages (e.g., OpenGL and DirectX) allow applications to specify the execution of particular shaders. Shaders are instruction sets that define how certain pieces of geometry or fragments are processed by a graphics processor. These shader instruction sets can be quite long and detailed in what they do, and often execute millions of times. In order to enable execution of these shaders on graphics processors, a compiler is employed. An exemplary compiler takes an instruction set written with a programming language (e.g., C programming language and other similar programming languages) and compiles the instruction set into a shader binary code or microcode that can be executed on the graphics processor.

One or more of these shader instruction sets may be compiled together to form an entire execution pipeline or program object. An exemplary process by which an application selects shaders and compiles the selected shader sources into binaries and links them together into a program object may require an unbounded amount of time. An exemplary compiler may require an extensive amount of optimization time while compiling and linking intermediate results and/or a final output.

Shader compilation times on mobile hardware can take a significant portion of frame render time. For example, on an application, such as a video game, running at 60 Hz, the frame render time is roughly 16 ms. When the application compiles a handful of shaders, which take on average 3-5 ms to compile on current mobile hardware, the shader compilation time can easily exceed the frame time and unfortunately cause visible stuttering on the screen.

An application attempting to compile shaders during runtime risks frame hitches or a gap of time when rendering stops or slows as shaders are compiled and programs linked together. Such visible stuttering or frame rate hitches are undesirable. Applications may attempt to get around this by compiling in-between runtimes, such that the results of the compile or link are not required for immediate execution. Despite such timing efforts, there are states or contexts that may change in 3D graphics, requiring one or more shaders to be recompiled during runtime. So even if an application attempts to compile and link all required shaders ahead of time (such as in-between levels of a video game) it is still possible for the application to require shader recompiling during runtime.

It would also be difficult for an application vendor to supply a shader binary library that contains all of the possible binaries that would need to be stored so as to avoid compiling. Such an exemplary effort may result in a binary library containing hundreds of thousands of possible shader binaries to supply shader binaries for every possible configuration and shader combination possible. Even so, should unexpected changes occur, the stored binaries would then be out of date. Shader binaries will need to be recreated whenever the graphics hardware, application, or graphics drivers change so that the recompiled shader binaries are updated. In other words, a large binary library will not provide the required flexibility to update the executable shader binaries to reflect any changes to hardware and software.

SUMMARY OF THE INVENTION

Embodiments of this present invention provide solutions to the challenges inherent in efficiently compiling shaders during runtime. According to one embodiment of the present invention, a method for compiling a shader for execution by a graphics processor is disclosed. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.

In a computer system according to one embodiment of the present invention, the computer system comprises a processor, a graphics processor, and a memory. The memory is operable to store instructions that, when executed, perform a method for compiling a shader for execution by a graphics processor. The method comprises selecting a shader for execution. A key is computed for the selected shader. A memory is searched for a copy of the computed key. A shader binary stored in the memory is passed to the graphics processor for execution if the copy of the computed key is located in the memory. Otherwise, the shader is compiled to produce the shader binary for execution by the graphics processor, and the shader binary is stored in the memory. The shader binary is associated with the computed key and the copy of the computed key.

In a computer system according to one embodiment of the present invention, the computer system comprises a compiler, a memory, a graphics driver module, and a graphics processor. The compiler is operable to compile and link shader source code to create a shader binary. The memory is operable to store a plurality of shader binaries. Each shader binary is paired with an associated key. The graphics driver module is operable to select one or more shaders for execution by the graphics processor and to compute a key for a selected shader, and is further operable to search the memory for a copy of the computed key. A shader binary is passed from the memory for execution by the graphics processor if the copy of the computed key is located in the memory. Otherwise, the compiler is operable to compile and link the shader to create a shader binary for execution by the graphics processor and for storage in the memory. The shader binary is associated with the computed key and the copy of the computed key.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be better understood from the following detailed description, taken in conjunction with the accompanying drawing figures in which like reference characters designate like elements and in which:

FIG. 1 illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of shader binaries in accordance with an embodiment of the present invention;

FIG. 2 illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of shader binaries and corresponding keys in accordance with an embodiment of the present invention;

FIG. 3A illustrates an exemplary schematic block diagram of a graphics rendering system with a persistent cache for persistent storage of ARB assemblies and shader microcode in accordance with an embodiment of the present invention;

FIG. 3B illustrates an exemplary functional block diagram of a graphics rendering system with a persistent cache for persistent storage of ARB assemblies and shader microcode in accordance with an embodiment of the present invention; and

FIG. 4 illustrates an exemplary flow diagram illustrating steps of a computer implemented method for compiling a shader for execution by a graphics processor in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION

Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of embodiments of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be recognized by one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments of the present invention. The drawings showing embodiments of the invention are semi-diagrammatic and not to scale and, particularly, some of the dimensions are for the clarity of presentation and are shown exaggerated in the drawing Figures. Similarly, although the views in the drawings for the ease of description generally show similar orientations, this depiction in the Figures is arbitrary for the most part. Generally, the invention can be operated in any orientation.

Notation and Nomenclature:

Some portions of the detailed descriptions, which follow, are presented in terms of procedures, steps, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, computer executed step, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “processing” or “accessing” or “executing” or “storing” or “rendering” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories and other computer readable media into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices. When a component appears in several embodiments, the use of the same reference numeral signifies that the component is the same component as illustrated in the original embodiment.

This present invention provides a solution to the increasing challenges inherent in compiling and linking shader source code to produce shader binaries at runtime. Various embodiments of the present disclosure provide an exemplary persistent cache memory that stores shader binaries and associated keys. As discussed in detail below, when a graphics driver selects a shader instruction set for execution by a graphics processor, a cache is searched for a key associated with a shader binary for the selected shader instruction set. If the key is found in the memory, then the corresponding shader binary is sent to the graphics processor for execution; otherwise, the shader instruction set is sent to a compiler for compiling and linking to create a shader binary for execution by the graphics processor and storage in the memory.

Improving Application Load Times by Caching Shader Binaries in a Persistent Store:

FIG. 1 illustrates an exemplary graphics rendering system comprising a shader instruction set 102 (hereafter referred to as a shader), a graphics driver module 104, a compiler 106, a cache 108, and a graphics processor 110. As discussed herein, an exemplary shader 102 may comprise shader source code written with a high-level programming language, such as C programming language or other similar languages. In another embodiment, a plurality of shaders 102 (e.g., a program object) may be compiled, linked and stored in the cache 108. In one exemplary embodiment, the cache 108 is a persistent memory that retains stored shader binaries and associated keys between runtime sessions so that previously compiled and linked shaders are immediately available (in the cache) the next time they need to be executed. In one embodiment, an exemplary cache 108 for a mobile computing system may be approximately 2-8 Mbytes in size, while an exemplary cache 108 for a desktop computing system may be approximately 64 Mbytes. Other cache sizes are also possible and are within the scope of this disclosure.

As discussed herein, some mobile applications, such as a WebGL compatible browser, may compile a significant number of shaders during a single runtime of the application. In one embodiment, each shader binary for a mobile computing system may be 1-2 Kbytes in size. This may cause the persistent cache 108 to fill up. One solution is to track the usage of shader binaries and to delete any old entries based on a least recently used (LRU) cache storage algorithm. Another solution may be to use a coarse ring buffer scheme using a pair of persistent files. For example, when one of the persistent files is filled up, the other persistent file may be truncated and new entries appended to that persistent file.
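The LRU option described above can be sketched as follows. This is a minimal in-memory illustration only, not the disclosed implementation; the class name `LRUShaderCache`, its methods, and the byte-based capacity accounting are hypothetical choices made for the example.

```python
from collections import OrderedDict


class LRUShaderCache:
    """Size-bounded key -> shader binary map (hypothetical sketch).

    Entries are kept in recency order; when the byte budget is exceeded,
    the least recently used entries are evicted first.
    """

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()

    def get(self, key):
        binary = self._entries.get(key)
        if binary is not None:
            # A hit makes this entry the most recently used one.
            self._entries.move_to_end(key)
        return binary

    def put(self, key, binary):
        if key in self._entries:
            # Replacing an entry: release the old size first.
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = binary
        self.used_bytes += len(binary)
        # Evict from the least recently used end until within budget.
        while self.used_bytes > self.capacity_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
```

With binaries of 1-2 Kbytes and a cache of a few Mbytes, such a structure would hold on the order of a few thousand entries before eviction begins.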

In one embodiment, shader binaries stored in a persistent cache may be compressed (e.g., RLE compression) beforehand. In one embodiment, the shader binary is compressed immediately before placement into the cache and decompressed immediately after removal from the cache.
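As an illustration of the kind of run-length encoding mentioned above, the following sketch encodes a byte string as (run length, value) pairs. The pair layout and the 255-byte run limit are assumptions made for this example; the disclosure does not specify an RLE format.

```python
def rle_compress(data: bytes) -> bytes:
    """Encode data as repeated (run_length, value) byte pairs (assumed format)."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte matches, up to 255 per pair.
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decompress(blob: bytes) -> bytes:
    """Invert rle_compress by expanding each (run_length, value) pair."""
    out = bytearray()
    for i in range(0, len(blob), 2):
        out += bytes([blob[i + 1]]) * blob[i]
    return bytes(out)
```

A scheme like this only shrinks data that contains long runs of identical bytes (for example, zero padding); input without such runs would grow, so a real cache might store the smaller of the two forms.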

As illustrated in FIG. 1, a selected shader (or a plurality of shaders) 102 is received by the graphics driver 104. After receiving the shader 102, the graphics driver 104 searches the cache 108 for a shader binary that is a compiled and linked version of the selected shader 102, ready to be executed by the graphics processor. As illustrated in FIG. 1, if there is not a matching shader binary in the cache 108, the graphics driver 104 forwards the shader 102 to the compiler 106 for compiling and linking to produce the desired shader binary. This shader binary is forwarded to the graphics driver 104, where it is then forwarded to both the graphics processor 110 for execution and to the cache 108 for persistent storage. In one exemplary embodiment, the cache 108 may be searched for a desired shader binary by searching the cache 108 for a key that is associated with the desired shader binary. As discussed in detail below, a shader binary paired with an associated key is stored in the cache 108. As also discussed herein, when a shader is selected for execution, a key is computed that is associated with the desired shader binary and is used to search for a matching key stored in the cache 108. In other words, the computed key is used to search the cache 108 for a matching key paired with the desired shader binary.

FIG. 2 illustrates an exemplary functional block diagram of a graphics rendering system operable to produce a shader binary for execution by a graphics processor and the computation of a key associated with the shader binary. As illustrated in FIG. 2, an exemplary graphics driver 202 provides shader arguments to a compiler 204 and a compute key block 206. The compiler 204 receives the shader arguments and shader text that are used by the compiler 204 to compile and link the shader to produce a shader binary. As also illustrated in FIG. 2, the compute key function block 206 receives the shader arguments and shader text along with a graphics driver version, graphics processor type/version, and a compiler driver version to produce a key that is associated with the produced shader binary. In other words, a shader binary is based upon the corresponding shader arguments and shader text that are also dependent upon the current graphics driver version, compiler version, and graphics processor type/version.
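The role of the compute key block 206 can be sketched as a hash over all of the inputs listed above. This is a hedged illustration rather than the disclosed implementation: the function name `compute_key`, the use of SHA-256, the length-prefixed field encoding, and the truncation to 8 bytes are all assumptions chosen for the example.

```python
import hashlib


def compute_key(shader_text, shader_args, driver_version, gpu_type, compiler_version):
    """Derive an integer key from every input that affects the shader binary.

    Hypothetical sketch: each field is length-prefixed so that adjacent
    fields cannot be confused with one another, and the digest is
    truncated to 8 bytes as an illustrative key width.
    """
    fields = [
        shader_text,
        str(len(shader_args)),  # argument count, so the argument list is unambiguous
        *shader_args,
        driver_version,
        gpu_type,
        compiler_version,
    ]
    h = hashlib.sha256()
    for field in fields:
        data = field.encode("utf-8")
        h.update(len(data).to_bytes(4, "little"))
        h.update(data)
    return int.from_bytes(h.digest()[:8], "little")
```

Because the driver, processor, and compiler versions feed into the hash, a change to any of them yields a different key, so binaries built under the old configuration are simply never matched.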

In one embodiment, a cache stores shader binaries and associates each shader binary with a key. In one embodiment, a key size and a hash function may be chosen such that the probability of collisions is kept extremely low. In one embodiment, a 64-bit key may suffice, as the number of possible shaders that a typical application may compile for execution is at most in the tens of thousands. In one embodiment, an exemplary key is computed using a hash function on a source shader string and the shader arguments that are also passed to the shader compiler. These arguments are computed internally by the graphics driver using the graphics driver's current state. The same shader instruction set may be compiled using different compiler arguments, resulting in multiple key/value pairs being added to the cache as graphics driver states change.
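To give a sense of why a 64-bit key may suffice, the following sketch evaluates the standard birthday approximation, p ≈ 1 − exp(−n(n−1)/2N), for n keys drawn from N = 2^bits values. It assumes the hash output is uniformly distributed, and the figure of 50,000 shaders is an illustrative stand-in for "tens of thousands"; neither assumption comes from the disclosure.

```python
import math


def collision_probability(num_keys, key_bits):
    """Approximate chance that any two of num_keys uniform keys collide."""
    space = 2.0 ** key_bits
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-num_keys * (num_keys - 1) / (2.0 * space))


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, collision_probability(50_000, bits))
```

Under these assumptions the probability for 64-bit keys is many orders of magnitude below that for 32-bit keys, which at this shader count would collide with appreciable probability.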

In one embodiment, as discussed herein, the cache 210 may also contain a global key that is computed at runtime based on a current graphics driver version, current compiler version, and other hardware related states. The global key may be computed at graphics driver startup and compared to a global key previously stored in the cache 210. If there is a mismatch between the previously stored global key and the new global key, the cache 210 is out of date and all stored shader binaries are invalidated. When stored shader binaries are invalidated, a shader selected for execution will need to be compiled, even if a copy of the shader binary is stored in the cache 210 (in other words, the stored shader binary is invalid). Such global keys may be used to ensure that only the latest updated shader binaries are used by the application.
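The startup check described above might look like the following sketch. The dictionary layout (`"global_key"` and `"entries"`), the function names, and the use of SHA-256 are assumptions for illustration; the disclosure does not state how the global key is derived or stored.

```python
import hashlib


def compute_global_key(driver_version, compiler_version, hardware_id):
    """Hash the configuration that every cached binary depends on (assumed inputs)."""
    blob = "\0".join((driver_version, compiler_version, hardware_id)).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def validate_cache(cache, current_global_key):
    """Invalidate all entries if the stored global key is stale.

    Returns True when the existing entries were kept, False when the
    cache was cleared and stamped with the current global key.
    """
    if cache.get("global_key") == current_global_key:
        return True
    cache["entries"] = {}
    cache["global_key"] = current_global_key
    return False
```

A single comparison at driver startup is enough to discard every stale binary at once, without examining individual entries.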

As illustrated in FIG. 2, the shader binary and associated key are forwarded together to the cache 210 for storage. In one embodiment, the cache 210 is a persistent memory that retains its saved contents from previous runtime sessions. As illustrated in FIG. 1, the shader binary may be forwarded to a graphics processor 110 for execution. To ensure that the shader binary has not been corrupted, a compute checksum function block 208 computes a checksum that is also stored in the cache 210 along with the shader binary and key.

In one embodiment, a checksum may be used to ensure that a shader binary stored in a persistent cache 210 is uncorrupted. As noted herein, comparing a computed key to a key stored in the cache 210 may be used to ensure that the previously stored shader binary associated with the stored key is still valid (that there have not been software or hardware changes) while a checksum is used to ensure that a stored shader binary has not been corrupted due to copy errors, etc. In other words, a key is used to ensure that a desired shader binary selected is the correct one and is up to date, while the checksum is used to ensure that there are no errors in the cached shader binary.
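The division of labor between key and checksum can be illustrated with a small sketch. CRC-32 (via Python's `zlib.crc32`) is used here purely as an example checksum; the disclosure does not name a checksum algorithm, and the entry layout and function names are hypothetical.

```python
import zlib


def make_entry(binary: bytes) -> dict:
    """Store a binary together with a checksum computed at insertion time."""
    return {"binary": binary, "checksum": zlib.crc32(binary)}


def is_intact(entry: dict) -> bool:
    """Recompute the checksum on retrieval and compare it to the stored one."""
    return zlib.crc32(entry["binary"]) == entry["checksum"]
```

The key answers "is this the right binary for the current shader and configuration?", while the checksum answers "are the stored bytes still the bytes that were written?".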

FIG. 3A illustrates an exemplary graphics rendering system 300. The graphics rendering system 300 illustrated in FIG. 3A comprises a shader 302, a graphics driver 304, a compiler 306, a cache 308, and a graphics processor 310. In one embodiment, a shader 302 is selected by a graphics driver 304 for execution by the graphics processor 310. As noted herein, before the shader is executed, the shader must be compiled and linked by the compiler to produce a shader binary. In one embodiment, rather than compiling and linking the shader to produce the required shader binary, the desired shader binary may be retrieved from the cache 308 and passed to the graphics processor 310 for execution. As discussed herein, when the compiler 306 compiles and links a shader to produce a shader binary, the paired shader binary and corresponding key are stored in the cache 308 for later retrieval.

In one embodiment, GLSL shaders used in applications are compiled to produce an intermediate compilation using ARB assembly code, which is compiled itself to produce an executable using shader microcode or binary. In one embodiment, the ARB assembly and the shader microcode or binary are stored in the cache 308. In one exemplary environment, an OpenGL graphics rendering API supports user-supplied ARB assembly programs and fixed-function shading, and these are all cached in the cache 308 for later retrieval.

In one embodiment, the desired shader binary (e.g., an ARB assembly and shader binary used to produce the desired executable) retrieved during a current runtime session was stored in the cache 308 during a previous runtime session. As illustrated in FIGS. 3A and 3B, a shader is passed to the compiler 306 and an ARB assembly is returned to the graphics driver 304, after which, the graphics driver 304 passes an ARB ASM to the compiler 306 to produce a shader microcode which is executed by the graphics processor 310. In one embodiment, the ARB assembly and shader binaries are compressed to minimize a required footprint in the cache 308 and to possibly further reduce load times. In one embodiment, RLE compression is used. As discussed herein, a checksum may also be used to evaluate cached shader binaries to prevent execution problems should the stored shader binary be corrupted (e.g., due to abnormal process termination or due to improper file locking).

FIG. 3B illustrates an exemplary functional block diagram of the graphics rendering system 300 illustrated in FIG. 3A. In FIG. 3B, a shader 302 is selected by a graphics driver 304 for execution by a graphics processor 310. As illustrated in FIG. 3B, the shader 302 is selected by the graphics driver 304 during a first phase 304-A. During the first phase 304-A, the shader 302 is passed to a frontend compiler 306-A for compiling. The frontend compiler 306-A returns an ARB assembly to the graphics driver 304 during the first phase 304-A. The ARB assembly is passed from the first phase 304-A to the second phase 304-B and forwarded to the backend compiler 306-B for further compiling. As illustrated in FIG. 3B, the backend compiler 306-B returns a shader microcode for execution by the graphics processor 310. In one embodiment, the backend compiler 306-B may produce a shader microcode or binary.

In one embodiment, as illustrated in FIGS. 3A and 3B, the first and second graphics driver phases 304-A, 304-B are performed by the graphics driver 304. In one embodiment, the frontend compiler functionality (306-A) and the backend compiler functionality (306-B) are implemented with a single compiler 306. As further illustrated in FIG. 3B, when the ARB assembly and the shader microcode are produced, the ARB assembly and shader microcode and associated keys may be stored in the cache 308 for persistent storage. In other words, separate unique keys for the ARB assembly code portion and for the shader microcode are created and stored.
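The two-stage arrangement with separate keys can be sketched as below. This is an assumption-laden illustration: the names `key_of` and `build`, the idea of keying the microcode on the ARB assembly text, and the stand-in `frontend` and `backend` callables are hypothetical, since the disclosure does not state what each stage's key is derived from.

```python
import hashlib


def key_of(*parts):
    """Hash length-prefixed string parts into an 8-byte key (illustrative)."""
    h = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "little") + data)
    return h.digest()[:8]


def build(shader_text, cache, frontend, backend):
    """Return microcode for shader_text, caching each stage under its own key."""
    # Stage 1: shader source -> ARB assembly (frontend compiler 306-A).
    asm_key = key_of("asm", shader_text)
    asm = cache.get(asm_key)
    if asm is None:
        asm = frontend(shader_text)
        cache[asm_key] = asm
    # Stage 2: ARB assembly -> shader microcode (backend compiler 306-B).
    ucode_key = key_of("ucode", asm)
    ucode = cache.get(ucode_key)
    if ucode is None:
        ucode = backend(asm)
        cache[ucode_key] = ucode
    return ucode
```

One consequence of keeping the stages separate in a design like this is that user-supplied ARB assembly programs can reuse the second-stage cache directly, without passing through the frontend.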

FIG. 4 illustrates an exemplary flow diagram illustrating a process for compiling a shader 102 for execution by a graphics processor 110 in accordance with an embodiment of the present invention. In step 402 of FIG. 4, a shader 102 is selected for execution by a graphics processor 110. In one embodiment, a single shader is selected for execution. In another embodiment a plurality of shaders are selected for execution together as a program object.

In step 404 of FIG. 4, a key is computed for the selected shader. In one embodiment, as discussed herein, a key is computed based upon shader arguments, shader text, current graphics driver version, and a current compiler driver version. The computed key is therefore associated with the desired shader binary. In step 406 of FIG. 4, the computed key is compared with keys stored in a cache 108. As discussed herein, during step 406, the computed key is compared to the stored keys to determine if a key associated with the desired shader binary is stored in the cache 108. If the desired shader binary is in the cache 108, the associated key will match the computed key.

In step 408 of FIG. 4, a determination is made as to whether or not the desired key is found in the cache 108. In other words, is there a match between the computed key (that is based upon the desired shader binary) and a key previously stored in the cache 108? If a key in the cache 108 matches the computed key, the process continues to step 410 of FIG. 4. However, if the desired key is not found in the cache 108, the process continues to step 416 of FIG. 4.

In step 410 of FIG. 4, a shader binary associated with the stored key that matches the computed key is retrieved from the cache 108 and a checksum is performed on the retrieved shader binary. In step 412 of FIG. 4, a determination is made as to whether the checksum passes. If the checksum passes, the process continues to step 414 of FIG. 4. If the checksum does not pass, the process continues to step 416 of FIG. 4. In step 414 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution.

In step 416 of FIG. 4, the selected shader or plurality of shaders is compiled and linked to produce a shader binary. In step 418 of FIG. 4, the shader binary is passed to the graphics processor 110 for execution. Lastly, in step 420 of FIG. 4, the shader binary and associated key and checksum are stored in the cache 108. As discussed herein, after the shader binary and the associated key and checksum are stored in the cache 108, the shader binary will be available for execution by the graphics processor 110 during future runtime sessions without having to compile and link the shader again.
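The decision flow of steps 406 through 420 can be summarized in a short sketch. It is a simplified illustration under stated assumptions: the cache is modeled as an in-memory dictionary, CRC-32 stands in for the unspecified checksum, `compile_and_link` is a hypothetical callable, and the function returns the binary to its caller instead of passing it to a graphics processor, so the store (step 420) happens before the caller would execute it (step 418).

```python
import zlib


def get_shader_binary(key, shader_source, cache, compile_and_link):
    """Return a valid shader binary, compiling only on a miss or bad checksum."""
    entry = cache.get(key)                       # steps 406/408: look up the computed key
    if entry is not None:
        binary, checksum = entry                 # step 410: retrieve the stored binary
        if zlib.crc32(binary) == checksum:       # step 412: does the checksum pass?
            return binary                        # step 414: use the cached binary
    binary = compile_and_link(shader_source)     # step 416: miss or corruption, so compile
    cache[key] = (binary, zlib.crc32(binary))    # step 420: store binary, key, and checksum
    return binary                                # step 418: caller passes it on for execution
```

Note that a failed checksum is treated exactly like a miss, and the freshly compiled binary overwrites the corrupted entry.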

Although certain preferred embodiments and methods have been disclosed herein, it will be apparent from the foregoing disclosure to those skilled in the art that variations and modifications of such embodiments and methods may be made without departing from the spirit and scope of the invention. It is intended that the invention shall be limited only to the extent required by the appended claims and the rules and principles of applicable law.

What is claimed is:
1. A method for compiling a shader for execution by a graphics processor, the method comprising: selecting a shader for execution; computing a computed key for the selected shader; searching a memory for a copy of the computed key; and passing a shader binary stored in the memory to the graphics processor for execution if the copy of the computed key is located in the memory, otherwise compiling the shader to produce a shader binary for execution by the graphics processor and storing the shader binary in the memory, wherein the shader binary is associated with the computed key and the copy of the computed key.
2. The method of claim 1, wherein compiling the shader to produce a shader binary comprises computing an associated key to be stored with the shader binary in the memory.
3. The method of claim 1 further comprising: verifying a checksum associated with the shader binary before execution by the graphics processor; and recompiling the shader to produce a replacement shader binary for execution by the graphics processor and storing the shader binary in the memory if the verification fails.
4. The method of claim 1 further comprising: generating a first global key; comparing the first global key to a second global key previously stored in the memory; and invalidating all binaries stored in the memory if the first global key mismatches the second global key.
5. The method of claim 2, wherein a key is based on at least one of: shader arguments; shader text; graphics driver version; graphics processor type/version; and compiler version.
6. The method of claim 1, wherein a shader binary is based on at least one of: shader arguments; shader text; graphics driver version; graphics processor type/version; and compiler version.
7. The method of claim 1, wherein the storing the shader binary in the memory comprises storing an associated key and an associated checksum with the shader binary in the memory.
8. A computer system comprising: a processor; a graphics processor; and a memory, wherein the memory is operable to store instructions, that when executed by the processor perform a method for compiling a shader for execution by a graphics processor, the method comprising: selecting a shader for execution; computing a computed key for the selected shader; searching a memory for a copy of the computed key; and passing a shader binary stored in the memory to the graphics processor for execution if the copy of the computed key is located in the memory, otherwise compiling the shader to produce a shader binary for execution by the graphics processor and storing the shader binary in the memory, wherein the shader binary is associated with the computed key and the copy of the computed key.
9. The computer system of claim 8, wherein compiling the shader to produce a shader binary comprises computing an associated key to be stored with the shader binary in the memory.
10. The computer system of claim 8, wherein the method further comprises: verifying a checksum associated with the shader binary before execution by the graphics processor; and recompiling the shader to produce a replacement shader binary for execution by the graphics processor and storing the shader binary in the memory if the verification fails.
11. The computer system of claim 8, wherein the method further comprises: generating a first global key; comparing the first global key to a second global key previously stored in the memory; and invalidating all binaries stored in the memory if the first global key mismatches the second global key.
12. The computer system of claim 9, wherein a key is based on at least one of: shader arguments; shader text; graphics driver version; graphics processor type/version; and compiler version.
13. The computer system of claim 8, wherein a shader binary is based on at least one of: shader arguments; shader text; graphics driver version; graphics processor type/version; and compiler version.
14. The computer system of claim 8, wherein the storing the shader binary in the memory comprises storing an associated key and an associated checksum with the shader binary in the memory.
15. A computer system comprising: a compiler operable to compile and link shader source code to create a shader binary; a memory operable to store a plurality of shader binaries, wherein each shader binary is paired with an associated key; and a graphics driver module operable to select one or more shaders for execution by a graphics processor and to compute a computed key for a selected shader, and is further operable to search the memory for a copy of the computed key and pass a shader binary from the memory for execution by the graphics processor if the copy of the computed key is located in the memory, otherwise the compiler is operable to compile and link the shader to create a shader binary for execution by the graphics processor and storing the shader binary in the memory, wherein the shader binary is associated with the computed key and the copy of the computed key.
16. The computer system of claim 15, wherein the computed key associated with the shader binary is also stored in the memory.
17. The computer system of claim 15, wherein the graphics driver is further operable to: verify a checksum associated with the shader binary before execution by the graphics processor; and recompile the shader source code to produce a replacement shader binary for execution by the graphics processor and storage in the memory if the verification fails.
18. The computer system of claim 15, wherein the graphics driver is further operable to compute a first global key, compare the first global key to a second global key previously stored in the memory, and invalidate all shader binaries stored in the memory if the first global key mismatches the second global key.
19. The computer system of claim 16, wherein a key is based on at least one of: shader arguments; shader text; graphics driver version; graphics processor type/version; and compiler version.
20. The computer system of claim 15, wherein a shader binary is based on at least one of: shader arguments; shader text; graphics driver version; graphics processor type/version; and compiler version.